Abstract: We study the mean and variance of the number of self-intersections of the
equilateral isotropic random walk in the plane, as well as the corresponding quantities
for isotropic equilateral random polygons (random walks conditioned to return to their
starting point after a given number of steps). The expected number of self-intersections is
$(2/\pi^2)\, n \log n + O(n)$ for both walks and polygons with $n$ steps. The variance is
$O(n^2 \log n)$ for both walks and polygons, which shows that the number of self-intersections
exhibits concentration around the mean.


1. Introduction 

The main objects of study in this paper are random walks and polygons in two dimensions.
We shall, however, also need to refer to projections onto two dimensions of random
walks and polygons in higher-dimensional spaces, so we shall begin by defining these objects
in $d$-dimensional space. A sequence $0 = X_0, X_1, \ldots, X_n$ of points in $\mathbf{R}^d$ will be called
a random walk in $d$-dimensional space if the differences $X_1 - X_0, X_2 - X_1, \ldots, X_n - X_{n-1}$
are independent identically distributed random variables in $\mathbf{R}^d$. We may also refer to the
union $[X_0, X_1] \cup [X_1, X_2] \cup \cdots \cup [X_{n-1}, X_n]$ of the line segments $[X_{k-1}, X_k]$ between the
successive points $X_{k-1}$ and $X_k$ as the random walk. All of the random walks we study will
be isotropic; that is, the directions $(X_k - X_{k-1})/|X_k - X_{k-1}|$ of the steps will always be
uniformly distributed over the $(d-1)$-dimensional unit sphere, independent of the lengths
$|X_k - X_{k-1}|$ of the steps, so that the distribution of a random walk of length $n$ in $\mathbf{R}^d$ can
be specified by giving the common distribution of the lengths. A random polygon in $\mathbf{R}^d$ is
a random walk in $\mathbf{R}^d$ conditioned on the event $X_n = X_0$ of returning to the origin after $n$
steps.

A random walk or polygon in $\mathbf{R}^d$ with $d \ge 3$ can be projected onto a plane to give
a random walk or polygon in $\mathbf{R}^2$. Since our random walks are isotropic, the distribution
of a projected walk or polygon will not depend on the choice of the plane onto which it is
projected.

We shall be interested in the distribution of the number of self-intersections of a 
random walk or polygon. Because angles are continuously distributed in our models, we 
can ignore the possibility that two points in a random walk coincide, that a point falls on 
a line segment, or that two line segments overlap in an interval of strictly positive length, 
since these events occur with probability zero. When $n \ge 3$, the same observation applies
to random polygons. Thus the self-intersections occur at the interiors of line segments, and
the number of self-intersections is the number of pairs of distinct line segments $[X_{k-1}, X_k]$
and $[X_{l-1}, X_l]$, with $1 \le k < l \le n$ and $l - k \ge 2$, that intersect at an interior point of
each segment. (In the case of polygons, we also exclude $l - k = n - 1$.)
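The counting rule just described can be sketched in code. The following is a minimal illustration, not part of the original analysis: the function names (`segments_cross`, `count_self_intersections`, `equilateral_walk`) are hypothetical, and the crossing test uses the standard orientation (signed-area) criterion, which treats touching or collinear configurations as non-crossings, in line with the probability-zero events ignored above.

```python
import math
import random


def segments_cross(p, q, r, s):
    """Return True if segments pq and rs cross at an interior point of each."""
    def orient(a, b, c):
        # Twice the signed area of triangle abc; the sign gives the turn direction.
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

    d1, d2 = orient(p, q, r), orient(p, q, s)
    d3, d4 = orient(r, s, p), orient(r, s, q)
    # Strict sign changes: r, s lie strictly on opposite sides of line pq,
    # and p, q lie strictly on opposite sides of line rs.
    return d1 * d2 < 0 and d3 * d4 < 0


def count_self_intersections(points, closed=False):
    """Count crossing pairs of segments k < l with l - k >= 2.

    points = [X_0, ..., X_n]. With closed=True the path is treated as a
    polygon (X_n = X_0) and the pair with l - k = n - 1 is skipped as well.
    """
    n = len(points) - 1  # number of segments
    count = 0
    for k in range(1, n + 1):
        for l in range(k + 2, n + 1):
            if closed and l - k == n - 1:
                continue
            if segments_cross(points[k - 1], points[k], points[l - 1], points[l]):
                count += 1
    return count


def equilateral_walk(n, rng):
    """n unit steps with independent uniform directions, starting at the origin."""
    x = y = 0.0
    pts = [(x, y)]
    for _ in range(n):
        theta = rng.uniform(-math.pi, math.pi)
        x, y = x + math.cos(theta), y + math.sin(theta)
        pts.append((x, y))
    return pts


if __name__ == "__main__":
    rng = random.Random(0)
    n, trials = 200, 20
    mean = sum(count_self_intersections(equilateral_walk(n, rng))
               for _ in range(trials)) / trials
    # Empirical mean next to the leading term (2/pi^2) n log n; the O(n)
    # correction is not negligible at this size.
    print(mean, (2 / math.pi ** 2) * n * math.log(n))
```

The quadratic loop over pairs is adequate for a few hundred steps; simulating random polygons would need a separate sampler for closed walks, which is not attempted here.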

Diao and Ernst [D2] have studied the number of self-intersections of Gaussian random
walks and polygons, showing that its mean is $(1/2\pi)\, n \log n + O(n)$ for both walks and
polygons. (For a Gaussian random walk, each step has an isotropic multivariate Gaussian
distribution. The number of self-intersections does not depend on the variance of the steps,
since this only affects the walk or polygon by a scale factor. And, since the projection of a
Gaussian random walk onto a smaller number of dimensions is again a Gaussian random
walk, their result does not depend on the dimension of the original walk.)
Diao et al. [D1] have studied the corresponding problem for the projections onto two
dimensions of three-dimensional equilateral random walks and polygons, obtaining the
estimate $(3/16)\, n \log n + O(n)$ for both walks and polygons. (For an equilateral random
walk, each step has unit length. The projection of a three-dimensional equilateral walk
onto two dimensions is not equilateral, so their analysis is done in three-dimensional space.)

In this paper we study two-dimensional equilateral isotropic random walks and polygons.
As might be expected, our result for the mean number of self-intersections differs
from the results cited above only in the constant factor in the leading term: we show that
it is $(2/\pi^2)\, n \log n + O(n)$ for both walks and polygons. But we carry the analysis further
than that of the results cited above, and show that the variance is $O(n^2 \log n)$ for both
walks and polygons. Thus the number of self-intersections exhibits concentration about its
mean in both cases. Indeed, by Chebyshev's inequality, the probability that the number of
self-intersections differs from its mean by more than $n (\log n)^{3/4}$ is at most $O(1/(\log n)^{1/2})$.
Finally, we observe that $1/2\pi = 0.1591\ldots$, $3/16 = 0.1875$, and $2/\pi^2 = 0.2026\ldots$. Thus
equilateral isotropic random walks and polygons have on the average more self-intersections
than their counterparts in either of the other models mentioned above.

2. Quasi-Gaussian Densities 

We shall say that a two-dimensional probability density $f_{R,\Theta}(r, \vartheta)$ is $n$-Gaussian if its
polar coordinates $(R, \Theta)$ have a density of the form

$$f_{R,\Theta}(r, \vartheta) = \frac{1}{\pi n} \exp\left(-\frac{r^2}{n}\right).$$

In this paper we shall often encounter two-dimensional densities that are approximately,
but not exactly, Gaussian. In this section we shall define a suitable notion of "approximately
Gaussian" which we call "quasi-Gaussian". We shall show that the sum of $n$ steps
of an equilateral isotropic random walk is quasi-Gaussian.

We shall say that a two-dimensional probability density $f_{R,\Theta}(r, \vartheta)$ is $n$-quasi-Gaussian
if its polar coordinates $(R, \Theta)$ have a density of the form

$$f_{R,\Theta}(r, \vartheta) = \frac{1}{\pi n} \exp\left(-\frac{r^2}{n}\right) + O\left(\frac{1}{n^2}\right), \tag{2.1}$$

where the constant in the $O$-term is independent of $r$, $\vartheta$ and $n$. (This definition really
applies to a family of densities parameterized by $n$, and it only refers to their behavior
for large $n$. It does not require that $f_{R,\Theta}(r, \vartheta)$ be independent of $\vartheta$, but only that its
dependence on $\vartheta$ affects the density by at most $O(1/n^2)$.)

The distribution of the isotropic equilateral random walk was apparently first treated
by Rayleigh [R1, pp. 35-42] in 1877. This random walk gives the distribution of the
amplitude and phase of the sum of identical sinusoidal oscillations with equal amplitudes
and random phases. Rayleigh gave the asymptotic formula

$$f_n(r) \sim \frac{2r}{n} \exp\left(-\frac{r^2}{n}\right)$$

for the radial density function of this walk after $n$ steps. Since the walk is isotropic,
dividing by $2\pi r$ gives the density

$$\frac{1}{\pi n} \exp\left(-\frac{r^2}{n}\right),$$

which agrees with the $n$-Gaussian factor in (2.1).

In 1906, Kluyver [K] gave the integral representation

$$F_n(r) = r \int_0^\infty J_1(rx)\, J_0(x)^n \, dx \tag{2.2}$$

for the radial distribution function, where $J_\nu(x)$ is the Bessel function of order $\nu$ (see
Watson [W1]). The representation

$$f_n(r) = r \int_0^\infty J_0(rx)\, J_0(x)^n \, x \, dx \tag{2.3}$$

for the radial density function can be obtained from (2.2) by differentiating with respect
to $r$, then using the identity $J_0'(x) = -J_1(x)$ (see Watson [W1], p. 18) and the differential
equation $x J_0''(x) + J_0'(x) + x J_0(x) = 0$ (see Watson [W1], p. 19):

$$\begin{aligned}
\frac{d}{dr}\, r J_1(rx) &= -\frac{d}{dr}\, r J_0'(rx) \\
&= -rx J_0''(rx) - J_0'(rx) \\
&= rx J_0(rx).
\end{aligned}$$

(For $n \ge 5$ the integral is absolutely convergent: we have $|J_1(rx)| \le 1$ (see Watson
[W1, p. 31]) and $J_0(x) = O(1/x^{1/2})$ (see Watson [W1, p. 195]). This fact justifies the
differentiation of the integral (see for example Whittaker and Watson [W2, p. 174]).)
Dividing (2.3) by $2\pi r$ yields

$$f_{R,\Theta}(r, \vartheta) = \frac{1}{2\pi} \int_0^\infty J_0(rx)\, J_0(x)^n \, x \, dx. \tag{2.4}$$


In 1919, Rayleigh [R2] gave a heuristic derivation of an asymptotic expansion for $f_n(r)$
that, if proved rigorously, would show that the equilateral random walk is quasi-Gaussian.
(Rayleigh's derivation involves differentiating an asymptotic expansion term-by-term.) We
shall give a rigorous proof below that the equilateral random walk is $n$-quasi-Gaussian.
(Our proof could be extended to establish the complete asymptotic expansion, up to terms
of order $O(1/n^k)$ for any fixed $k \ge 1$.)

Proposition 2.1: The sum of $n$ steps of an equilateral isotropic random walk is
$n$-quasi-Gaussian.

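Proof: In view of (2.4), it will suffice to show that

$$\frac{1}{2\pi} \int_0^\infty J_0(rx)\, J_0(x)^n \, x \, dx = \frac{1}{\pi n} \exp\left(-\frac{r^2}{n}\right) + O\left(\frac{1}{n^2}\right), \tag{2.5}$$

where the constant in the $O$-term is independent of both $r$ and $n$. Let $x_0 = (24 \log n / n)^{1/2}$.
Our first step will be to show that

$$\int_{x_0}^\infty J_0(rx)\, J_0(x)^n \, x \, dx = O\left(\frac{1}{n^2}\right). \tag{2.6}$$

Let $x_1 = n^{1/3}$. We shall prove (2.6) by showing that

$$\int_{x_0}^{x_1} J_0(rx)\, J_0(x)^n \, x \, dx = O\left(\frac{1}{n^2}\right) \tag{2.7}$$

and

$$\int_{x_1}^\infty J_0(rx)\, J_0(x)^n \, x \, dx = O\left(\frac{1}{n^2}\right). \tag{2.8}$$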


To prove (2.7), we first observe that $J_0(x)$ is analytic for $x \in [0, \infty)$ and $J_0(x) = 1 - x^2/4 +
O(x^4)$ as $x \to 0$ (see Watson [W1, p. 16]), and that $J_0(x)$ assumes values near 1 only for
$x$ near 0. (The last fact can easily be seen from a graph of $J_0(x)$; we shall indicate how
it can be derived from facts proved by Watson [W1], who gives no graphs!) Firstly, the
integral representation

$$J_0(x) = \frac{1}{\pi} \int_0^\pi \cos(x \cos\vartheta)\, d\vartheta$$

(see Watson [W1, p. 24]) shows that $J_0(x) = 1$ only for $x = 0$ (because it is an average
of quantities that are all 1 only if $x = 0$). Thus we cannot have $J_0(x_n) \to 1$ for $x_n \to x$
for any finite $x > 0$, because $J_0(x)$, being analytic, is continuous. We also cannot have
$J_0(x_n) \to 1$ for $x_n \to \infty$, since $J_0(x) \to 0$ as $x \to \infty$ (see Watson [W1, p. 195]). Thus
$J_0(x)$ assumes values near 1 only for $x$ near 0. Thus there exists $\xi_0 > 0$ such that, for
$x \le \xi_0$, we have not only $|J_0(x)| \le 1 - x^2/8$, but also $|J_0(y)| \le 1 - x^2/8$ for all $y \ge x$. We
have $x_0 \le \xi_0$ for all sufficiently large $n$. Since $|J_0(rx)| \le 1$, we then have

$$\begin{aligned}
\left| \int_{x_0}^{x_1} J_0(rx)\, J_0(x)^n \, x \, dx \right| &\le x_1^2 \left(1 - x_0^2/8\right)^n \\
&\le x_1^2 \exp\left(-\frac{n x_0^2}{8}\right) \\
&= O\left(\frac{1}{n^2}\right),
\end{aligned}$$

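which proves (2.7). To prove (2.8), we observe that since $J_0(x) = (2/\pi x)^{1/2} \cos(x - \pi/4) +
O(1/x)$ (see Watson [W1, p. 195]), there exists $\xi_1$ such that $|J_0(x)| \le 1/x^{1/2}$ for all $x \ge \xi_1$.
We have $x_1 \ge \xi_1$ for all sufficiently large $n$. If in addition we have $n \ge 16$, we then have

$$\begin{aligned}
\left| \int_{x_1}^\infty J_0(rx)\, J_0(x)^n \, x \, dx \right| &\le \int_{x_1}^\infty \frac{dx}{x^{n/2-1}} \\
&= O\left(\frac{1}{x_1^{n/2-2}}\right) \\
&= O\left(\frac{1}{x_1^{6}}\right) \\
&= O\left(\frac{1}{n^2}\right),
\end{aligned}$$

which proves (2.8), and completes the proof of (2.6). Thus to prove (2.5), it will suffice to
show that

$$\frac{1}{2\pi} \int_0^{x_0} J_0(rx)\, J_0(x)^n \, x \, dx = \frac{1}{\pi n} \exp\left(-\frac{r^2}{n}\right) + O\left(\frac{1}{n^2}\right). \tag{2.9}$$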

Our next step will be to estimate the factor $J_0(x)^n$ of the integrand in (2.9) over the
range $x \in [0, x_0]$. Since $J_0(x) = 1 - x^2/4 + x^4/64 + O(x^6)$ in this range, $\log(1 + y) =
y - y^2/2 + O(y^3)$ for $x < 1$, and $\exp z = 1 + z + z^2/2 + O(z^3)$ as $z \to 0$, we have

$$\begin{aligned}
J_0(x)^n &= \exp\bigl(n \log J_0(x)\bigr) \\
&= \exp\left(n \log\left(1 - \frac{x^2}{4} + \frac{x^4}{64} + O(x^6)\right)\right) \\
&= \exp\left(-\frac{n x^2}{4} - \frac{n x^4}{64} + O(n x^6)\right) \\
&= \exp\left(-\frac{n x^2}{4}\right) \exp\left(-\frac{n x^4}{64} + O(n x^6)\right) \\
&= \exp\left(-\frac{n x^2}{4}\right) \left(1 - \frac{n x^4}{64} + O(n x^6) + O(n^2 x^8)\right).
\end{aligned}$$

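Thus we have

$$\begin{aligned}
\int_0^{x_0} J_0(rx)\, J_0(x)^n \, x \, dx &= \int_0^{x_0} J_0(rx) \exp\left(-\frac{n x^2}{4}\right) x \, dx
 - \frac{n}{64} \int_0^{x_0} J_0(rx) \exp\left(-\frac{n x^2}{4}\right) x^5 \, dx \\
&\quad + O\left(\int_0^{x_0} n x^7 \, dx\right) + O\left(\int_0^{x_0} n^2 x^9 \, dx\right).
\end{aligned} \tag{2.10}$$

Since $n x_0^8 = O(\log^4 n / n^3)$ and $n^2 x_0^{10} = O(\log^5 n / n^3)$, both of the last two terms in (2.10)
are $O(1/n^2)$, so we have

$$\int_0^{x_0} J_0(rx)\, J_0(x)^n \, x \, dx = \int_0^{x_0} J_0(rx) \exp\left(-\frac{n x^2}{4}\right) x \, dx
 - \frac{n}{64} \int_0^{x_0} J_0(rx) \exp\left(-\frac{n x^2}{4}\right) x^5 \, dx + O\left(\frac{1}{n^2}\right). \tag{2.11}$$

To simplify the integrals on the right-hand side of (2.11), we observe that

$$\int_{x_0}^\infty \exp\left(-\frac{n x^2}{4}\right) x \, dx = \left[-\frac{2}{n} \exp\left(-\frac{n x^2}{4}\right)\right]_{x = x_0}^{\infty} = O\left(\frac{1}{n^7}\right)$$

and

$$\int_{x_0}^\infty \exp\left(-\frac{n x^2}{4}\right) x^5 \, dx = \left[-\left(\frac{64}{n^3} + \frac{16 x^2}{n^2} + \frac{2 x^4}{n}\right) \exp\left(-\frac{n x^2}{4}\right)\right]_{x = x_0}^{\infty} = O\left(\frac{\log^2 n}{n^9}\right).$$

Since $|J_0(x)| \le 1$, these estimates imply

$$\int_{x_0}^\infty J_0(rx) \exp\left(-\frac{n x^2}{4}\right) x \, dx = O\left(\frac{1}{n^7}\right)$$

and

$$\frac{n}{64} \int_{x_0}^\infty J_0(rx) \exp\left(-\frac{n x^2}{4}\right) x^5 \, dx = O\left(\frac{\log^2 n}{n^8}\right).$$

Adding these integrals to the right-hand side of (2.11), we obtain

$$\int_0^{x_0} J_0(rx)\, J_0(x)^n \, x \, dx = \int_0^\infty J_0(rx) \exp\left(-\frac{n x^2}{4}\right) x \, dx
 - \frac{n}{64} \int_0^\infty J_0(rx) \exp\left(-\frac{n x^2}{4}\right) x^5 \, dx + O\left(\frac{1}{n^2}\right). \tag{2.12}$$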

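To evaluate the integrals on the right-hand side of (2.12), we shall use the integrals

$$\int_0^\infty J_0(rx) \exp(-a x^2)\, x \, dx = \frac{1}{2a} \exp\left(-\frac{r^2}{4a}\right) \tag{2.13}$$

(see Watson [W1, p. 393]) and

$$\int_0^\infty J_0(rx) \exp(-a x^2)\, x^5 \, dx = \left(\frac{1}{a^3} - \frac{r^2}{2 a^4} + \frac{r^4}{32 a^5}\right) \exp\left(-\frac{r^2}{4a}\right), \tag{2.14}$$

which can be obtained from (2.13) by differentiating twice with respect to $a$ (see for example
Whittaker and Watson [W2, p. 74]). Applying these integrals to (2.12) with $a = n/4$ yields

$$\int_0^{x_0} J_0(rx)\, J_0(x)^n \, x \, dx = \frac{2}{n} \exp\left(-\frac{r^2}{n}\right)
 - \left(\frac{1}{n^2} - \frac{2 r^2}{n^3} + \frac{r^4}{2 n^4}\right) \exp\left(-\frac{r^2}{n}\right) + O\left(\frac{1}{n^2}\right).$$

Since $r^2 \exp(-r^2/n) = O(n)$ and $r^4 \exp(-r^2/n) = O(n^2)$, we obtain

$$\int_0^{x_0} J_0(rx)\, J_0(x)^n \, x \, dx = \frac{2}{n} \exp\left(-\frac{r^2}{n}\right) + O\left(\frac{1}{n^2}\right).$$

Multiplying by $1/2\pi$ yields (2.9), which completes the proof of (2.5). □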
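The two Bessel integrals (2.13) and (2.14) used at the end of the proof can be spot-checked numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it evaluates $J_0$ from its power series (adequate only for moderate arguments), truncates the integral at a finite cutoff where the Gaussian factor is negligible, and uses Simpson's rule. All function names are hypothetical.

```python
import math


def bessel_j0(x):
    """J_0(x) from its power series; adequate in double precision for moderate |x|."""
    term = 1.0
    total = 1.0
    q = (x / 2.0) ** 2
    for k in range(1, 60):
        term *= -q / (k * k)
        total += term
    return total


def simpson(f, lo, hi, steps=2000):
    """Composite Simpson rule; steps must be even."""
    h = (hi - lo) / steps
    s = f(lo) + f(hi)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * f(lo + i * h)
    return s * h / 3.0


def weber_lhs(r, a, power, cutoff=8.0):
    """Integral of J_0(r x) exp(-a x^2) x^power over [0, cutoff] (tail assumed negligible)."""
    return simpson(
        lambda x: bessel_j0(r * x) * math.exp(-a * x * x) * x ** power, 0.0, cutoff
    )


def rhs_2_13(r, a):
    """Right-hand side of (2.13)."""
    return math.exp(-r * r / (4 * a)) / (2 * a)


def rhs_2_14(r, a):
    """Right-hand side of (2.14)."""
    poly = 1 / a ** 3 - r ** 2 / (2 * a ** 4) + r ** 4 / (32 * a ** 5)
    return poly * math.exp(-r * r / (4 * a))


if __name__ == "__main__":
    r, a = 1.0, 1.0
    print(weber_lhs(r, a, 1), rhs_2_13(r, a))
    print(weber_lhs(r, a, 5), rhs_2_14(r, a))
```

The cutoff of 8 is chosen for $a = 1$; for the value $a = n/4$ used in the proof the Gaussian factor decays much faster, so a smaller cutoff would suffice.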
3. A Triple Integral

In this section we shall evaluate a triple integral that gives the coefficient $2/\pi^2$ of the
$n \log n$ term in our results. We consider the following geometric situation. Let $Q$ denote the
line segment of length $R$ from the origin to $(R, \Theta)$, where $(R, \Theta)$ has an $m$-quasi-Gaussian
density. Let $S$ denote the line segment from the origin of unit length and making an angle
$-\pi < \Psi < \pi$, measured counterclockwise from $Q$. Let $T$ denote the line segment from
$(R, \Theta)$ of unit length making angle $-\pi < \Phi < \pi$, measured clockwise from $Q$. Let $\Psi$ and
$\Phi$ be uniformly distributed in the interval $(-\pi, \pi)$, independently of each other and of $R$.
Let $E_m$ be the event that the segments $S$ and $T$ intersect at an interior point.

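Proposition 3.1: We have

$$\Pr[E_m] = \frac{2}{\pi^2 m} + O\left(\frac{1}{m^2}\right). \tag{3.1}$$

Proof: Define the indicator function $I(r, \psi, \varphi)$ to be 1 (or 0) according as the segments $S$
and $T$ do (or do not) intersect at an interior point when $R$, $\Psi$ and $\Phi$ assume the values $r$,
$\psi$ and $\varphi$, respectively. Then

$$\Pr[E_m] = \frac{1}{(2\pi)^2} \int_{-\pi}^{\pi} \int_{-\pi}^{\pi} \int_0^\infty I(r, \psi, \varphi)\, f_R(r) \, dr \, d\psi \, d\varphi.$$

Since $(R, \Theta)$ is $m$-quasi-Gaussian, we have

$$\Pr[E_m] = \frac{1}{(2\pi)^2} \int_{-\pi}^{\pi} \int_{-\pi}^{\pi} \int_0^\infty I(r, \psi, \varphi) \left(\frac{2r}{m} \exp\left(-\frac{r^2}{m}\right) + O\left(\frac{r}{m^2}\right)\right) dr \, d\psi \, d\varphi.$$

Since $I(r, \psi, \varphi)$ vanishes unless $r < 2$, we can reduce the upper limit of the innermost
integral:

$$\Pr[E_m] = \frac{1}{2\pi^2 m} \int_{-\pi}^{\pi} \int_{-\pi}^{\pi} \int_0^2 I(r, \psi, \varphi)\, r \exp\left(-\frac{r^2}{m}\right) dr \, d\psi \, d\varphi + O\left(\frac{1}{m^2}\right).$$

Since $\exp(x) = 1 + O(x)$ as $x \to 0$, we obtain

$$\Pr[E_m] = \frac{1}{2\pi^2 m} \int_{-\pi}^{\pi} \int_{-\pi}^{\pi} \int_0^2 I(r, \psi, \varphi)\, r \, dr \, d\psi \, d\varphi + O\left(\frac{1}{m^2}\right).$$

Thus it will suffice to show that

$$\int_{-\pi}^{\pi} \int_{-\pi}^{\pi} \int_0^2 I(r, \psi, \varphi)\, r \, dr \, d\psi \, d\varphi = 4. \tag{3.2}$$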

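Let $J$ denote the triple integral in (3.2). We shall show that $J = 4$. It is clear that
there is a function $g : (-\pi, \pi) \times (-\pi, \pi) \to [0, 2]$ such that

$$I(r, \psi, \varphi) = \begin{cases} 1, & \text{if } r < g(\psi, \varphi), \\ 0, & \text{if } r > g(\psi, \varphi). \end{cases}$$

Thus

$$J = \int_{-\pi}^{\pi} \int_{-\pi}^{\pi} \int_0^{g(\psi, \varphi)} r \, dr \, d\psi \, d\varphi
 = \frac{1}{2} \int_{-\pi}^{\pi} \int_{-\pi}^{\pi} g(\psi, \varphi)^2 \, d\psi \, d\varphi.$$

It is clear that $g(\psi, \varphi)$ vanishes unless $\psi$ and $\varphi$ have the same sign, and that it is unchanged
if this common sign is reversed; thus

$$J = \int_0^{\pi} \int_0^{\pi} g(\psi, \varphi)^2 \, d\psi \, d\varphi.$$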


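It is clear that $g(\psi, \varphi)$ vanishes if both $\psi$ and $\varphi$ are obtuse (that is, belong to $(\pi/2, \pi)$).
Thus we may break the integral into three parts, according as $\varphi$, $\psi$, or neither is obtuse:

$$J = \int_{\pi/2}^{\pi} \int_0^{\pi/2} g(\psi, \varphi)^2 \, d\psi \, d\varphi
 + \int_0^{\pi/2} \int_{\pi/2}^{\pi} g(\psi, \varphi)^2 \, d\psi \, d\varphi
 + \int_0^{\pi/2} \int_0^{\pi/2} g(\psi, \varphi)^2 \, d\psi \, d\varphi.$$

Since $g(\psi, \varphi)$ is unchanged by the exchange of $\psi$ and $\varphi$, the second term equals the first,
so

$$J = 2 \int_{\pi/2}^{\pi} \int_0^{\pi/2} g(\psi, \varphi)^2 \, d\psi \, d\varphi
 + \int_0^{\pi/2} \int_0^{\pi/2} g(\psi, \varphi)^2 \, d\psi \, d\varphi.$$

Furthermore, since $g(\psi, \varphi) = 0$ unless $\psi < \pi - \varphi$, we can restrict the range of the inner
integral in the first term, so

$$J = 2 \int_{\pi/2}^{\pi} \int_0^{\pi - \varphi} g(\psi, \varphi)^2 \, d\psi \, d\varphi
 + \int_0^{\pi/2} \int_0^{\pi/2} g(\psi, \varphi)^2 \, d\psi \, d\varphi.$$

Again using the symmetry between $\psi$ and $\varphi$, we may restrict the range of the inner integral
in the second term to $\psi < \varphi$, and double the resulting term, so

$$\begin{aligned}
J &= 2 \int_{\pi/2}^{\pi} \int_0^{\pi - \varphi} g(\psi, \varphi)^2 \, d\psi \, d\varphi
 + 2 \int_0^{\pi/2} \int_0^{\varphi} g(\psi, \varphi)^2 \, d\psi \, d\varphi \\
&= 2 \int_0^{\pi/2} \int_{\psi}^{\pi - \psi} g(\psi, \varphi)^2 \, d\varphi \, d\psi.
\end{aligned} \tag{3.3}$$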

For the range of integration in (3.3), a little trigonometry shows that

$$g(\psi, \varphi) = \sin(\varphi + \psi) \operatorname{cosec} \varphi. \tag{3.4}$$

To see this, we may imagine starting with $r = 2$ and then reducing $r$ until $S$ and $T$ intersect,
which happens when $r = g(\psi, \varphi)$. Then in the resulting triangle, the side opposite $\varphi$ has
length 1, while $g(\psi, \varphi)$ is the length of the side opposite the angle $\pi - \varphi - \psi$. Applying
the law of sines, and using the fact that $\sin(\pi - \varphi - \psi) = \sin(\varphi + \psi)$, we obtain (3.4).
Substituting (3.4) into (3.3), we obtain

$$J = 2 \int_0^{\pi/2} \int_{\psi}^{\pi - \psi} \sin^2(\varphi + \psi) \operatorname{cosec}^2 \varphi \, d\varphi \, d\psi. \tag{3.5}$$

From the antiderivative

$$\int \sin^2(\varphi + \psi) \operatorname{cosec}^2 \varphi \, d\varphi = \varphi \cos^2 \psi - (\varphi + \operatorname{cotan} \varphi) \sin^2 \psi + \log \sin \varphi \, \sin(2\psi),$$

we obtain

$$\int_{\psi}^{\pi - \psi} \sin^2(\varphi + \psi) \operatorname{cosec}^2 \varphi \, d\varphi = (\pi - 2\psi) \cos(2\psi) + \sin(2\psi).$$

Substituting this value for the inner integral in (3.5) yields

$$J = 2 \int_0^{\pi/2} \bigl( (\pi - 2\psi) \cos(2\psi) + \sin(2\psi) \bigr) \, d\psi,$$

and evaluating this integral we obtain $J = 4$ as desired. This completes the proof of (3.2).

□
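As a sanity check on the value $J = 4$, the double integral (3.5) can be evaluated numerically. The following sketch uses a plain midpoint rule on the region $0 < \psi < \pi/2$, $\psi < \varphi < \pi - \psi$, with $g$ taken from (3.4); the function names and the grid size are illustrative assumptions, and the accuracy is limited near the corner $\psi \to 0$, where $g$ varies rapidly.

```python
import math


def g(psi, phi):
    """Threshold (3.4), valid on the range psi <= phi <= pi - psi."""
    return math.sin(phi + psi) / math.sin(phi)


def triple_integral_J(steps=400):
    """Midpoint estimate of (3.5): 2 * int_0^{pi/2} int_psi^{pi-psi} g^2 dphi dpsi."""
    total = 0.0
    h_psi = (math.pi / 2) / steps
    for i in range(steps):
        psi = (i + 0.5) * h_psi
        h_phi = (math.pi - 2 * psi) / steps
        inner = 0.0
        for j in range(steps):
            phi = psi + (j + 0.5) * h_phi
            inner += g(psi, phi) ** 2
        total += inner * h_phi * h_psi
    return 2 * total


def inner_closed_form(psi):
    """Closed form of the inner integral as stated in the text."""
    return (math.pi - 2 * psi) * math.cos(2 * psi) + math.sin(2 * psi)


if __name__ == "__main__":
    print(triple_integral_J())
```

Dividing the result by $2\pi^2$ gives the coefficient $2/\pi^2$ of Proposition 3.1.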

We observe for future reference that Proposition 3.1 continues to hold if a fixed constant
displacement $C$ is added to the $m$-quasi-Gaussian step from origin to $(R, \Theta)$, because
an $m$-quasi-Gaussian distribution assigns densities that differ at most by a factor
$(1 + O(1/m))$ to all points within a bounded distance of the origin. (The constant in the
$O$-term may now depend on $C$.)


4. The Mean 

We begin with walks. Let the random variable $K_n$ denote the number of
self-intersections in a two-dimensional equilateral isotropic random walk. In this section, we
shall derive the estimate

$$\operatorname{Ex}[K_n] = \frac{2}{\pi^2}\, n \log n + O(n). \tag{4.1}$$

Let $L_{i,j}$ denote the event that the $i$-th segment $[X_{i-1}, X_i]$ intersects the $j$-th segment
$[X_{j-1}, X_j]$ at an interior point of each segment. We shall also write $L_{i,j}$ for the indicator
function of that event, assuming the value 1 when the event occurs, and the value 0 when
it does not, so that $\operatorname{Ex}[L_{i,j}] = \Pr[L_{i,j}]$. By the linearity of expectation, we have

$$\operatorname{Ex}[K_n] = \sum_{1 \le i < j \le n} \Pr[L_{i,j}]. \tag{4.2}$$

Our problem is now to estimate $\Pr[L_{i,j}]$.

It is clear that $\Pr[L_{i,j}]$ depends only on the number $a = j - i - 1$ of steps between the
end of the $i$-th step and the beginning of the $j$-th step, and that it is zero unless $a \ge 1$.
Furthermore, $L_{i,j}$ has the same probability as the event $E_a$ defined in the preceding section
($(R, \Theta)$ is the polar representation of the sum of the $(i+1)$-st through the $(j-1)$-st steps,
$S$ is the $i$-th step and $T$ is the $j$-th step). Let $b = n - j - 1$, as shown in Figure 4.1.

Figure 4.1

By Proposition 3.1, we have

$$\Pr[L_{i,j}] = \frac{2}{\pi^2 a} + O\left(\frac{1}{a^2}\right).$$
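Summing over the possible values of $a$ and $b$, we obtain

$$\begin{aligned}
\operatorname{Ex}[K_n] &= \sum_{0 \le b \le n-3} \; \sum_{1 \le a \le n-2-b} \Pr[L_{i,j}] \\
&= \sum_{0 \le b \le n-3} \; \sum_{1 \le a \le n-2-b} \left(\frac{2}{\pi^2 a} + O\left(\frac{1}{a^2}\right)\right) \\
&= \frac{2}{\pi^2}\, n \log n + O(n),
\end{aligned}$$

because

$$\sum_{1 \le k \le m} \frac{1}{k} = \log m + O(1)$$

and

$$\sum_{1 \le k \le m} \log k = m \log m - m + O(\log m).$$

Thus (4.1) is verified.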
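The summation step can be illustrated numerically. The sketch below, with hypothetical function names, evaluates the leading double sum $\sum_b \sum_a 2/(\pi^2 a)$ over the ranges used above and compares it with $(2/\pi^2)\, n \log n$; the normalized remainder should stay bounded as $n$ grows, which is what the $O(n)$ term asserts. It ignores the $O(1/a^2)$ corrections, which contribute only $O(n)$ in total.

```python
import math


def leading_sum(n):
    """Sum of 2/(pi^2 a) over 0 <= b <= n-3 and 1 <= a <= n-2-b."""
    total = 0.0
    harmonic = 0.0  # running value of 1 + 1/2 + ... + 1/m
    for m in range(1, n - 1):  # m = n - 2 - b takes the values 1, ..., n-2
        harmonic += 1.0 / m
        total += harmonic
    return 2.0 / math.pi ** 2 * total


def normalized_remainder(n):
    """(sum - (2/pi^2) n log n) / n, expected to remain bounded in n."""
    return (leading_sum(n) - 2.0 / math.pi ** 2 * n * math.log(n)) / n


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, normalized_remainder(n))
```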
We turn now to polygons. Let the random variable $K'_n$ denote the number of
self-intersections in a two-dimensional equilateral isotropic random polygon. We shall derive
the estimate

$$\operatorname{Ex}[K'_n] = \frac{2}{\pi^2}\, n \log n + O(n). \tag{4.3}$$

For polygons, with their circular symmetry, there is only one topological configuration
for a self-intersection, as depicted in Figure 4.2.




Figure 4.2 


The two "stretches" (indicated by solid lines) in Figure 4.2 are topologically equivalent,
and drawing either of them at the bottom results in a picture like Figure 4.3.




Figure 4.3 

We shall analyze this case with arguments similar to those we used for Figure 4.1. But for
polygons we have the constraint $a + b = n - 2$, so we shall not sum over all combinations
of $a$ and $b$. Since at least one of the two stretches in Figure 4.2 must have length at least
$(n - 2)/2 \ge n/4$ (assuming $n \ge 4$), we may choose such a long stretch to draw at the
bottom, so that we have $b \ge n/4$. This choice implies $a \le (n - 2)/2$. Thus we shall sum
$\Pr[L_{i,j}]$ over $1 \le a \le (n - 2)/2$ and then include an extra factor of $n$ in the sum, to take
account of the $n$ possible positions in which this figure might appear around the polygon.

By circular symmetry, $\Pr[L_{i,j}]$ depends only on $a = j - i - 1$ and on $n$. To determine
this dependence, we must reconsider the situation described in the preceding section,
conditioning on the event that $T$ and $S$ are the first and last steps of an equilateral
isotropic random walk of $b + 2$ steps from $(R, \Theta)$ back to the origin. This introduces two
complicating effects. First, the density of $(R, \Theta)$ is no longer $a$-quasi-Gaussian. We shall
see, however, that it is quasi-Gaussian with a smaller parameter. Second, $\Psi$ and $\Phi$ are no
longer independent and uniformly distributed. We shall see, however, that they are close
to being so.

We begin by reconsidering the density of $(R, \Theta)$. It is well known that the density of
a $v$-Gaussian step, conditioned on the event that a further independent $w$-Gaussian step
returns to the origin, is $(v \parallel w)$-Gaussian, where $(v \parallel w) = vw/(v + w)$ is the "harmonic
sum" of $v$ and $w$. (The variances of "parallel" Gaussian steps combine like resistance in
parallel.) To derive this result, we have only to multiply the densities and integrate the
result to renormalize. For quasi-Gaussian steps, we must add error terms $O(1/v^2)$ and
$O(1/w^2)$. But since

$$\frac{1}{v^2} + \frac{1}{w^2} = \frac{v^2 + w^2}{v^2 w^2} \le \frac{(v + w)^2}{(vw)^2} = \frac{1}{(v \parallel w)^2},$$

these error terms can be combined into a single one of order $O(1/(v \parallel w)^2)$. Since this
is the error term for a $(v \parallel w)$-quasi-Gaussian density, we conclude that the density of
$(R, \Theta)$ for polygons is $(a \parallel (b + 2))$-quasi-Gaussian.
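The "harmonic sum" and the inequality used to merge the two error terms are easy to check mechanically. The sketch below is only an illustration; `parallel` and `error_terms_combine` are hypothetical names.

```python
def parallel(v, w):
    """Harmonic sum (v || w) = v w / (v + w), as for resistances in parallel."""
    return v * w / (v + w)


def error_terms_combine(v, w):
    """Check 1/v^2 + 1/w^2 <= 1/(v || w)^2 for positive v and w."""
    return 1 / v ** 2 + 1 / w ** 2 <= 1 / parallel(v, w) ** 2


if __name__ == "__main__":
    # The exponents of the two Gaussian factors add: 1/v + 1/w = 1/(v || w).
    v, w = 3.0, 6.0
    print(parallel(v, w), 1 / v + 1 / w, 1 / parallel(v, w))
```

In the polygon setting the relevant combination is $a \parallel (b + 2)$, whose reciprocal is $1/a + 1/(b + 2)$; with $b \ge n/4$ the second summand is $O(1/n)$.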

We observe for future reference that if fixed constant displacements $C$ and $D$ are
added to the $v$- and $w$-quasi-Gaussian steps considered above, their parallel connection is
then $(v \parallel w)$-quasi-Gaussian with an added constant displacement $(vC + wD)/(v + w)$.

We turn next to the dependence of $\Psi$ and $\Phi$. Since $I(r, \psi, \varphi)$ vanishes unless $r < 2$, we
assume that $R$ is some value $r < 2$. In this case the other endpoints of the unit segments
$S$ and $T$ are at distance at most 4. Suppose in addition that $\Psi$ assumes some value $\psi$
(thus determining the position of the other endpoint $s$ of $S$). A $(b + 2)$-quasi-Gaussian
density assigns to all points within distance at most 4 of $s$ values that differ at most by a
factor $1 + O(1/b)$. Thus we have $f_{\Phi \mid R = r, \Psi = \psi}(\varphi) = (1/2\pi)(1 + O(1/b))$ when $r < 2$. By
the same argument, we have $f_{\Psi \mid R = r}(\psi) = (1/2\pi)(1 + O(1/b))$ when $r < 2$. Thus we
have $f_{\Psi, \Phi \mid R = r}(\psi, \varphi) = (1/2\pi)^2 (1 + O(1/b))$ when $r < 2$.

Substituting this result for the factor of $1/(2\pi)^2$ that represents the uniform joint
distribution of $\Psi$ and $\Phi$ in the preceding section, and changing the $a$-quasi-Gaussian
distribution of $(R, \Theta)$ to an $(a \parallel (b + 2))$-quasi-Gaussian distribution (as indicated in the
preceding paragraph), we conclude that

$$\begin{aligned}
\Pr[L_{i,j}] &= \left(\frac{2}{\pi^2 (a \parallel (b + 2))} + O\left(\frac{1}{(a \parallel (b + 2))^2}\right)\right) \left(1 + O\left(\frac{1}{b}\right)\right) \\
&= \frac{2}{\pi^2 a} + O\left(\frac{1}{a^2}\right) + O\left(\frac{1}{n}\right),
\end{aligned}$$

where we have used the fact that $b \ge n/4$ to simplify the parallel combinations involving $b$,
together with $1/n = O(1/a)$. Summing over $1 \le a \le n/2$ and multiplying by an additional
factor of $n$ then yields (4.3).

5. The Variance 

We again begin with walks. Recall that the random variable $K_n$ denotes the number of
self-intersections in a two-dimensional equilateral isotropic random walk. We shall derive
the estimate

$$\operatorname{Var}[K_n] = O(n^2 \log n). \tag{5.1}$$

We shall use the formula

$$\operatorname{Var}[K_n] = \sum_{\substack{1 \le i < j \le n \\ 1 \le k < l \le n}} \operatorname{Covar}[L_{i,j}, L_{k,l}], \tag{5.2}$$

where

$$\operatorname{Covar}[L_{i,j}, L_{k,l}] = \Pr[L_{i,j}, L_{k,l}] - \Pr[L_{i,j}] \cdot \Pr[L_{k,l}]. \tag{5.3}$$

We shall suppose to begin with that $i$, $j$, $k$ and $l$ are all distinct. In fact we shall suppose
that they differ pairwise by at least 2. We consider first the terms with $i < j < k < l$, as
depicted in Figure 5.1.

1 - i - j - k - l - n

Figure 5.1

For these terms, $L_{i,j}$ and $L_{k,l}$ are independent, so $\Pr[L_{i,j}, L_{k,l}] = \Pr[L_{i,j}] \cdot \Pr[L_{k,l}]$ and
$\operatorname{Covar}[L_{i,j}, L_{k,l}]$ vanishes. This observation of course also applies to terms with $k < l <
i < j$.

We consider next the terms with $k < i < j < l$, as depicted in Figure 5.2.

Figure 5.2

For these terms, we shall use the upper bound

$$\operatorname{Covar}[L_{i,j}, L_{k,l}] \le \Pr[L_{i,j}] \, \Pr[L_{k,l} \mid L_{i,j}]. \tag{5.4}$$

(Since $\operatorname{Var}[K_n]$ is non-negative, we can upper-bound it by upper-bounding each term in
the sum (5.2).) We shall define $a = i - k - 1$, $b = j - i - 1$, $c = l - j - 1$ and $d = n - l - 1$.
Then we must sum over terms with $a, b, c, d \ge 1$ and $a + b + c + d \le n - 4$. By Proposition
3.1,

$$\Pr[L_{i,j}] = \frac{2}{\pi^2 b} + O\left(\frac{1}{b^2}\right),$$

but we shall weaken this estimate to

$$\Pr[L_{i,j}] = O\left(\frac{1}{b}\right).$$
Suppose now that the event $L_{i,j}$ has occurred in some particular way. Then the displacement
from the end of the $k$-th step to the beginning of the $l$-th step is given by an $a$-step
equilateral isotropic random walk, followed by a constant step of length at most 2 (from
the beginning of the $i$-th step to the end of the $j$-th step), followed by a $c$-step equilateral
isotropic random walk. By the commutativity of addition, this is equivalent to an $(a + c)$-step
equilateral isotropic random walk, followed by the constant step. By the argument in
the proof of Proposition 3.1, we obtain

$$\Pr[L_{k,l} \mid L_{i,j}] = \frac{2}{\pi^2 (a + c)} + O\left(\frac{1}{(a + c)^2}\right),$$

but we shall weaken this estimate to

$$\Pr[L_{k,l} \mid L_{i,j}] = O\left(\frac{1}{a + c}\right).$$

Thus

$$\operatorname{Covar}[L_{i,j}, L_{k,l}] = O\left(\frac{1}{b} \cdot \frac{1}{a + c}\right).$$

We shall extend the range of summation to all $1 \le a, b, c, d \le n$, which can only increase
the result. Summing over $1 \le b \le n$ gives a factor of $O(\log n)$, because

$$\sum_{1 \le v \le m} \frac{1}{v} = \log m + O(1).$$

The sum over $a$ and $c$ contributes a factor of $O(n)$, because

$$\sum_{1 \le v \le m} \; \sum_{1 \le w \le m} \frac{1}{v + w} = O(m).$$

And the sum over $d$ of course contributes a factor of $O(n)$. Thus the quadruple sum over
$a$, $b$, $c$ and $d$ is $O(n^2 \log n)$. This estimate of course also applies to the sum of terms with
$i < k < l < j$.
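The two elementary sums behind the $O(n^2 \log n)$ count for the nested configuration can be tabulated directly. This is a small illustrative sketch with hypothetical names; `pair_sum` groups the pairs $(v, w)$ by the value of $v + w$ rather than looping over all $m^2$ pairs.

```python
import math


def harmonic_sum(m):
    """Sum of 1/v for 1 <= v <= m."""
    return sum(1.0 / v for v in range(1, m + 1))


def pair_sum(m):
    """Sum of 1/(v + w) for 1 <= v, w <= m."""
    # The value s = v + w is attained by min(s - 1, 2m + 1 - s) pairs.
    return sum(min(s - 1, 2 * m + 1 - s) / s for s in range(2, 2 * m + 1))


if __name__ == "__main__":
    for m in (10, 100, 1000):
        # First ratio tracks the log m growth; second ratio stays bounded.
        print(m, harmonic_sum(m) - math.log(m), pair_sum(m) / m)
```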

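We consider next the terms with $i < k < j < l$, as depicted below.

1 - i - k - j - l - n

Figure 5.3

For these terms we shall again use the upper bound (5.4), and the variables $a$, $b$, $c$ and $d$
as defined before. By Proposition 3.1,

$$\Pr[L_{i,j}] = \frac{2}{\pi^2 (a + b)} + O\left(\frac{1}{(a + b)^2}\right),$$

but we shall weaken this estimate to

$$\Pr[L_{i,j}] = O\left(\frac{1}{a + b}\right). \tag{5.5}$$

We claim that

$$\Pr[L_{k,l} \mid L_{i,j}] = O\left(\frac{1}{(a \parallel b) + c}\right). \tag{5.6}$$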

To see this, suppose that the event $L_{i,j}$ has occurred in some particular way. Consider
the displacement $A$ from the beginning of the $j$-th step to the end of the $k$-th step. This
displacement consists of a constant step of distance at most 2 (from the beginning of the
$j$-th step to the end of the $i$-th step), followed by an equilateral isotropic random walk of
$(a + 1)$ steps, conditioned on the event that a further $b$-step equilateral isotropic random
walk returns to the beginning of the $j$-th step. It follows that $A$ is a constant (of length
at most 2) followed by an $(a \parallel b)$-quasi-Gaussian step. Furthermore, the angular density
of the $k$-th step is within a factor $(1 + O(1/(a \parallel b)))$ of uniform. Consider next the
displacement $B$ from the beginning of the $j$-th step to the beginning of the $l$-th step.
This displacement consists of a constant step of length 1 (the $j$-th step) followed by a
$c$-step equilateral isotropic random walk, and is thus a constant step of length 1 followed
by a $c$-quasi-Gaussian step. Furthermore, the angular density of the $l$-th step is uniform.
We must now consider the total displacement $A + B$ from the end of the $k$-th step to the
beginning of the $l$-th step. Apart from the constant steps, we have an $(a \parallel b)$-quasi-Gaussian
step followed by a $c$-quasi-Gaussian step. It is well known that the sum of a $v$-Gaussian
step and a $w$-Gaussian step is a $(v + w)$-Gaussian step. (The variances of "series" Gaussian
steps combine like resistance in series.) For quasi-Gaussian steps, however, we must add
error terms $O(1/(v + w)v)$ and $O(1/(v + w)w)$. These error terms can be combined to
the single error term $O(1/(v + w)(v \parallel w))$. Thus a sum of quasi-Gaussian steps is not
necessarily quasi-Gaussian (that would require an error term $O(1/(v + w)^2)$), but it differs
from quasi-Gaussian only in having the larger error term $O(1/(v + w)(v \parallel w))$. Applying
this result to the problem at hand, we conclude that

$$\Pr[L_{k,l} \mid L_{i,j}] = \frac{2}{\pi^2 ((a \parallel b) + c)} + O\left(\frac{1}{((a \parallel b) + c)(a \parallel b \parallel c)}\right),$$

but we shall weaken this estimate to (5.6).

Combining (5.4), (5.5) and (5.6), we obtain

$$\begin{aligned}
\operatorname{Covar}[L_{i,j}, L_{k,l}] &= O\left(\frac{1}{a + b} \cdot \frac{1}{(a \parallel b) + c}\right) \\
&= O\left(\frac{1}{ab + ac + bc}\right).
\end{aligned}$$

Since the arithmetic mean $(ab + ac + bc)/3$ exceeds the corresponding geometric mean
$(abc)^{2/3}$, we obtain

$$\operatorname{Covar}[L_{i,j}, L_{k,l}] = O\left(\frac{1}{a^{2/3} b^{2/3} c^{2/3}}\right).$$

We sum as before over all $1 \le a, b, c, d \le n$. The sums over each of $a$, $b$ and $c$ contribute a
factor of $O(n^{1/3})$, because

$$\sum_{1 \le v \le n} \frac{1}{v^{2/3}} = O(n^{1/3}).$$

And the sum over $d$ of course contributes a factor of $n$. Thus the quadruple sum over
$a$, $b$, $c$ and $d$ is $O(n^2)$. This estimate of course also applies to the sum over terms with
$k < i < l < j$.
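The arithmetic-geometric mean step and the power sum used for the crossing configuration can be checked in the same spirit. The names below are hypothetical, and the small multiplicative slack in `crossing_bound_holds` only guards against floating-point rounding in the equality case $a = b = c$.

```python
def crossing_bound_holds(a, b, c):
    """Check ab + ac + bc >= 3 (abc)^(2/3), the AM-GM step used above."""
    return a * b + a * c + b * c >= 3 * (a * b * c) ** (2.0 / 3.0) * (1 - 1e-12)


def power_sum(n, exponent):
    """Sum of v^(-exponent) for 1 <= v <= n."""
    return sum(v ** -exponent for v in range(1, n + 1))


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        # Ratio to n^(1/3); comparison with an integral bounds it by 3.
        print(n, power_sum(n, 2.0 / 3.0) / n ** (1.0 / 3.0))
```

The same `power_sum` with exponent $3/4$ illustrates the $O(n^{1/4})$ sums that arise in the polygon case.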

At this point, we have considered all terms in (5.2) in which $i$, $j$, $k$ and $l$ pairwise
differ by at least 2. The remaining terms are much easier to deal with, and we shall not give
the estimates explicitly. We merely state that these terms contribute only $O(n (\log n)^2)$ to
the sum. Since all of these contributions are $O(n^2 \log n)$, we have verified (5.1).

We turn now to polygons. Recall that the random variable $K'_n$ denotes the number
of self-intersections in a two-dimensional equilateral isotropic random polygon. We shall
derive the estimate

$$\operatorname{Var}[K'_n] = O(n^2 \log n). \tag{5.7}$$

For polygons, with their circular symmetry, there are only two topologically distinct
configurations for a pair of self-intersections, as depicted in Figures 5.4 and 5.5.

Figure 5.4

Figure 5.5

The four stretches in Figure 5.4 are all topologically equivalent, and drawing any of them
at the bottom results in a picture like Figure 5.6.

Figure 5.6

We shall analyze this case with arguments similar to those we used for Figure 5.3. But
for polygons we have the constraint $a + b + c + d = n - 4$, so we shall not sum over all
combinations of $1 \le a, b, c, d \le n$. Rather, we shall take $d = n - 4 - a - b - c$, sum over all
combinations of $1 \le a, b, c \le n$, and then include an extra factor of $n$ in the sum, to take
account of the $n$ possible positions in which this figure might appear around the polygon.
Furthermore, since at least one of the four stretches in Figure 5.4 must have length at
least $(n - 4)/4 \ge n/8$ (assuming $n \ge 8$), we may choose such a long stretch to draw at the
bottom, so that we have $d \ge n/8$.

By the same arguments as we used for Figure 5.3, we have

$$\Pr[L_{i,j}] = O\left(\frac{1}{(a + b) \parallel (c + d)}\right)$$

and

$$\Pr[L_{k,l} \mid L_{i,j}] = O\left(\frac{1}{(a \parallel b) + (c \parallel d)}\right).$$

Thus

$$\operatorname{Covar}[L_{i,j}, L_{k,l}] = O\left(\frac{a + b + c + d}{abc + abd + acd + bcd}\right).$$

Again using an inequality $(abc + abd + acd + bcd)/4 \ge (abcd)^{3/4}$ between arithmetic and
geometric means, together with $a + b + c + d = n - 4$, we obtain

$$\operatorname{Covar}[L_{i,j}, L_{k,l}] = O\left(\frac{n}{a^{3/4} b^{3/4} c^{3/4} d^{3/4}}\right).$$

We sum this expression over all $1 \le a, b, c \le n$. The sums over $a$, $b$ and $c$ each
contribute a factor of $O(n^{1/4})$, because

$$\sum_{1 \le v \le n} \frac{1}{v^{3/4}} = O(n^{1/4}).$$

These contributions are cancelled by the factor of $d^{3/4} \ge (n/8)^{3/4}$ in the denominator.
This leaves just the factor of $n$ in the numerator. Multiplying by another factor of $n$ to
account for the positions in which this configuration may appear around the polygon, we
see that all the terms depicted in Figure 5.6 contribute $O(n^2)$ to the variance.

There are two topologically different kinds of stretches in Figure 5.5. If we draw it 
with one of its horizontal stretches at the bottom, we obtain a picture like Figure 5.7, 
while if we draw it with one of its vertical stretches at the bottom, we obtain a picture like 
Figure 5.8. 



Figure 5.7 



Figure 5.8 

We shall choose between these alternatives to ensure that $d \ge n/8$, as before.

We shall analyze Figure 5.7 with arguments similar to those we used for Figure 5.2.
We shall again take $d = n - 4 - a - b - c$, sum over all combinations of $1 \le a, b, c \le n$, and
then include an extra factor of $n$ in the sum, to take account of the $n$ possible positions
in which this figure might appear around the polygon.
By the same arguments as we used for Figure 5.2, we have

$$\Pr[L_{i,j}] = O\left(\frac{1}{b \parallel (a + c + d)}\right) = O\left(\frac{1}{b}\right)$$

and

$$\Pr[L_{k,l} \mid L_{i,j}] = O\left(\frac{1}{(a + c) \parallel d}\right) = O\left(\frac{1}{a + c}\right),$$

where we have used the fact that $d \ge n/8$ to simplify the parallel combinations involving
$d$. Thus

$$\operatorname{Covar}[L_{i,j}, L_{k,l}] = O\left(\frac{1}{b} \cdot \frac{1}{a + c}\right).$$

As in the analysis of Figure 5.2, the sum over $1 \le b \le n$ contributes a factor of $O(\log n)$ and
the sum over $1 \le a, c \le n$ contributes a factor of $O(n)$. Multiplying by another factor of $n$
to account for the positions in which this configuration may appear around the polygon,
we see that all the terms depicted in Figure 5.7 contribute $O(n^2 \log n)$ to the variance.

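Finally, we consider Figure 5.8. This case is most similar to that of Figure 5.1. But
where $L_{i,j}$ and $L_{k,l}$ were independent in Figure 5.1, the stretch of length $d$ introduces a
dependence in Figure 5.8. But since $d \ge n/8$, this dependence is weak. We shall need to
exploit cancellation between the terms in (5.3), rewriting it in the form

$$\operatorname{Covar}[L_{i,j}, L_{k,l}] = \Pr[L_{i,j}] \bigl( \Pr[L_{k,l} \mid L_{i,j}] - \Pr[L_{k,l}] \bigr). \tag{5.8}$$

We have

$$\Pr[L_{i,j}] = O\left(\frac{1}{a \parallel (b + c + d)}\right) = O\left(\frac{1}{a}\right), \tag{5.9}$$

$$\begin{aligned}
\Pr[L_{k,l} \mid L_{i,j}] &= \frac{2}{\pi^2 (c \parallel (b + d))} + O\left(\frac{1}{(c \parallel (b + d))^2}\right) \\
&= \frac{2}{\pi^2 c} + O\left(\frac{1}{c^2}\right) + O\left(\frac{1}{n}\right)
\end{aligned} \tag{5.10}$$

and

$$\begin{aligned}
\Pr[L_{k,l}] &= \frac{2}{\pi^2 (c \parallel (a + b + d))} + O\left(\frac{1}{(c \parallel (a + b + d))^2}\right) \\
&= \frac{2}{\pi^2 c} + O\left(\frac{1}{c^2}\right) + O\left(\frac{1}{n}\right),
\end{aligned} \tag{5.11}$$

where we have used the fact that $d \ge n/8$ to simplify the parallel combinations involving
$d$. Substituting (5.9), (5.10) and (5.11) in (5.8), we obtain

$$\operatorname{Covar}[L_{i,j}, L_{k,l}] = O\left(\frac{1}{a} \left(\frac{1}{c^2} + \frac{1}{n}\right)\right).$$

The sum over $1 \le a \le n$ contributes a factor of $O(\log n)$, the sum over $1 \le b \le n$
contributes a factor of $O(n)$, and the sum over $1 \le c \le n$ contributes a factor of $O(1)$.
Multiplying by another factor of $n$ to account for the positions in which this configuration
may appear around the polygon, we see that all the terms depicted in Figure 5.8 contribute
$O(n^2 \log n)$ to the variance.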

As was the case for walks, terms in which $i$, $j$, $k$ and $l$ do not differ pairwise by at
least 2 are much easier to deal with, and contribute only $O(n (\log n)^2)$ to the variance.
Since all of these contributions are $O(n^2 \log n)$, we have verified (5.7).



6. Conclusion 

We have shown that the mean number of self-intersections for both random walks and
random polygons in two dimensions with isotropic equilateral steps is $(2/\pi^2)\, n \log n + O(n)$.
We have also shown that the variance is $O(n^2 \log n)$ for both walks and polygons. We have
not determined the asymptotic behavior of the variance more exactly, because our result
suffices to show concentration around the mean. It remains an open problem to determine
the order of magnitude of (or, more ambitiously, an asymptotic formula for) the variance.

It would also be of interest to extend the results of this paper to Gaussian random
walks and polygons, or to the projections onto two dimensions of three-dimensional
equilateral walks and polygons, thereby establishing concentration about the mean for these
models. For these problems, the triple integral evaluated in Section 3 would be replaced by
a quintuple integral, and the evaluations of various conditional probabilities would become
more complicated, but the general strategy of our proofs should still be applicable.